This lab is about linear Regularized Least Squares. Follow the instructions below. Think hard before you call the instructors!
Download: the zip file (unzip it in a local folder).
Set the Matlab path to include the local folder.
We start again from data generation, using the function MixGauss:
1.A Generate a 2-class training set where the classes are centered at (-1,-1) and (1,1), each with standard deviation 0.35. Generate 100 points per class: input Xtr, output Ytr. Adjust the output labels so that they take values in {-1,1}: Ytr(Ytr==2)=-1.
1.B Generate the corresponding test set of 300 points per class: input Xts, output Yts. Remember to assign {-1,1} to the test labels, consistently with the training set.
1.C Add some noise to the previously generated data with the function flipLabels (type "help flipLabels" for some guidelines). Create a new set of noisy training and test output vectors: Ytrn, Ytsn.
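A minimal sketch of steps 1.A-1.C, assuming MixGauss takes the class centers as columns of a matrix, one standard deviation per class, and the number of points per class, and that flipLabels takes a label vector and a flip percentage (check the actual signatures with help MixGauss and help flipLabels):

```matlab
% Training set: two Gaussians centered at (-1,-1) and (1,1), sigma 0.35
% (assumed signature: MixGauss(centers, sigmas, points_per_class))
[Xtr, Ytr] = MixGauss([-1, 1; -1, 1], [0.35, 0.35], 100);
Ytr(Ytr==2) = -1;                 % relabel the classes as {-1, 1}

% Test set: same distribution, 300 points per class
[Xts, Yts] = MixGauss([-1, 1; -1, 1], [0.35, 0.35], 300);
Yts(Yts==2) = -1;

% Noisy labels (assumed signature: flipLabels(Y, percentage);
% check "help flipLabels" for the exact convention of the second argument)
perc = 10;                        % hypothetical value: flip 10% of the labels
Ytrn = flipLabels(Ytr, perc);
Ytsn = flipLabels(Yts, perc);
```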
Plot the various datasets with the function scatter, e.g.:
figure;
hold on
scatter(Xtr(Ytr==1,1), Xtr(Ytr==1,2), '.r');
scatter(Xtr(Ytr==-1,1), Xtr(Ytr==-1,2), '.b');
title('training set')
hold off
2.A Have a look at the code of functions regularizedLSTrain and regularizedLSTest, and complete them.
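The core computation the two functions must implement is the regularized least squares solution w = (X'X + lambda*n*I)^(-1) X'Y. A minimal sketch, assuming the skeletons take (Xtr, Ytr, lambda) and (w, Xts) as arguments and that lambda is scaled by n (a common convention); adapt the names and the scaling to whatever the downloaded skeletons actually prescribe, with each function in its own .m file:

```matlab
% regularizedLSTrain: solve (Xtr'*Xtr + lambda*n*I) w = Xtr'*Ytr
function w = regularizedLSTrain(Xtr, Ytr, lambda)
    n = size(Xtr, 1);            % number of training points
    d = size(Xtr, 2);            % input dimension
    w = (Xtr' * Xtr + lambda * n * eye(d)) \ (Xtr' * Ytr);
end
```

```matlab
% regularizedLSTest: linear prediction on new inputs
function Ypred = regularizedLSTest(w, Xts)
    Ypred = Xts * w;
end
```

The backslash operator solves the linear system directly, which is more stable than forming the inverse explicitly.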
2.B Try the functions on the previously generated 2-class data from section 1. Pick a "reasonable" lambda. Store the predictions in the Ypred vector.
2.C Think of how to plot the data to get a glimpse of the obtained results. A possible way is:
figure;
scatter(Xts(:,1), Xts(:,2), 25, Ytsn);
hold on
sel = (sign(Ypred) ~= Ytsn);
scatter(Xts(sel,1), Xts(sel,2), 200, Ytsn(sel), 'x');
hold off
2.D To evaluate the classification performance, compare the predicted outputs with the test labels previously generated.
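For instance, the classification error can be computed as the fraction of test points whose predicted sign disagrees with the (noisy) test label:

```matlab
% classification error: fraction of sign disagreements on the test set
err = mean(sign(Ypred) ~= Ytsn);
fprintf('test classification error: %.3f\n', err);
```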
2.E To visualize the separating function (and thus get a more general view of which areas are associated with each class) you may use the routine separatingFRLS (type "help separatingFRLS" in the Matlab shell; if you still have doubts on how to use it, have a look at the code).
Superimpose on the separating function both the training data (Xtr, Ytrn) and the test data (Xts, Ytsn), in two separate plots, to analyze the generalization properties of your solution.
3.A Repeat all the experiments in section 2 for different datasets (e.g. vary the training set size, the mean positions, the percentage of flipped labels, the shape of the classes).
3.B Think of how to generate a problem where the Gaussians live in a 200-dimensional space (clearly you will not be able to plot this dataset...).
3.C Consider a high-dimensional dataset. For example, produce two Gaussians with centers [1; zeros(199,1)] and [-1; zeros(199,1)], standard deviations [0.5, 0.5], and 50 points per class. Check what happens with varying lambda. How would you evaluate the quality of the solution?
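Since the 200-dimensional data cannot be plotted, one way to assess the solution is to sweep lambda and track the test error. A sketch under the same assumed MixGauss signature as before (centers as columns of a d x 2 matrix):

```matlab
% high-dimensional 2-class problem (assumed MixGauss signature)
centers = [[1; zeros(199,1)], [-1; zeros(199,1)]];
[Xtr, Ytr] = MixGauss(centers, [0.5, 0.5], 50);
Ytr(Ytr==2) = -1;
[Xts, Yts] = MixGauss(centers, [0.5, 0.5], 50);
Yts(Yts==2) = -1;

% sweep lambda over several orders of magnitude, record the test error
lambdas = logspace(-6, 2, 20);
errs = zeros(size(lambdas));
for i = 1:numel(lambdas)
    w = regularizedLSTrain(Xtr, Ytr, lambdas(i));
    Ypred = regularizedLSTest(w, Xts);
    errs(i) = mean(sign(Ypred) ~= Yts);
end
semilogx(lambdas, errs);
xlabel('\lambda'); ylabel('test error');
```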
3.D Generate a training set and a test set using AnisotropicMixGauss(mean1, Sigma1, mean2, Sigma2), with means [-2;0] and [2;0] and Sigma1 = Sigma2 = [1, -1; -1, 0.1]. Check the impact of lambda.
3.E Modify the regularizedLSTrain and regularizedLSTest functions to incorporate an offset in the linear model. Compare the solutions with and without offset on a 2-class dataset where the classes are centered at (0,0) and (1,1), each with standard deviation 0.35.
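A common way to add an offset is to append a constant column of ones to the inputs, so the last weight becomes the bias b. A sketch of the modified training function (hypothetical name; you can instead adapt regularizedLSTrain in place):

```matlab
% RLS with offset: f(x) = x*w + b, via input augmentation
function [w, b] = regularizedLSTrainOff(Xtr, Ytr, lambda)
    n = size(Xtr, 1);
    Xa = [Xtr, ones(n, 1)];      % augment inputs with a constant feature
    d = size(Xa, 2);
    wa = (Xa' * Xa + lambda * n * eye(d)) \ (Xa' * Ytr);
    w = wa(1:end-1);             % linear weights
    b = wa(end);                 % offset
end
```

Note that this sketch regularizes the offset together with the weights for simplicity; whether the offset should be penalized is a design choice worth discussing.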
3.F Modify the regularizedLSTrain and regularizedLSTest functions to handle multiclass problems.
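One option for T classes is one-vs-all: encode the labels as an n x T matrix with entries in {-1,1}, solve the same RLS problem with a matrix right-hand side, and predict with the argmax. A sketch with hypothetical function names, assuming Ytr contains class indices in 1,...,T:

```matlab
% one-vs-all RLS training: one column of W per class
function W = regularizedLSTrainMulti(Xtr, Ytr, lambda, T)
    n = size(Xtr, 1);
    d = size(Xtr, 2);
    Yova = -ones(n, T);                        % {-1,1} one-vs-all encoding
    Yova(sub2ind([n, T], (1:n)', Ytr)) = 1;    % +1 for the true class
    W = (Xtr' * Xtr + lambda * n * eye(d)) \ (Xtr' * Yova);
end
```

```matlab
% one-vs-all RLS prediction: pick the class with the largest score
function Ypred = regularizedLSTestMulti(W, Xts)
    [~, Ypred] = max(Xts * W, [], 2);
end
```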